Exploiting Transformer in Sparse Reward Reinforcement Learning for Interpretable Temporal Logic Motion Planning
Authors
Abstract
Automaton-based approaches have enabled robots to perform various complex tasks. However, most existing automaton-based algorithms rely heavily on a manually customized representation of states for the considered task, limiting their applicability in deep reinforcement learning algorithms. To address this issue, by incorporating Transformer into reinforcement learning, we develop a Double-Transformer-guided Temporal Logic framework (T2TL) that exploits the structural feature of Transformer twice, i.e., first encoding the LTL instruction via the Transformer module for efficient understanding of task instructions during training, and then encoding the context variable via the Transformer again for improved task performance. Particularly, the LTL instruction is specified by co-safe LTL. As a semantics-preserving rewriting operation, LTL progression is exploited to decompose the complex task into learnable sub-goals, which not only converts non-Markovian reward decision processes to Markovian ones, but also improves the sampling efficiency by simultaneous learning of multiple sub-tasks. An environment-agnostic LTL pre-training scheme is further incorporated to facilitate the learning of the Transformer module, resulting in an improved representation of LTL. The simulation results demonstrate the effectiveness of the T2TL framework.
Similar resources
Interpretable Apprenticeship Learning with Temporal Logic Specifications
Recent work has addressed using formulas in linear temporal logic (LTL) as specifications for agents planning in Markov Decision Processes (MDPs). We consider the inverse problem: inferring an LTL specification from demonstrated behavior trajectories in MDPs. We formulate this as a multiobjective optimization problem, and describe state-based (“what actually happened”) and action-based (“what t...
Sparse Coding for Learning Interpretable Spatio-Temporal Primitives
Sparse coding has recently become a popular approach in computer vision to learn dictionaries of natural images. In this paper we extend the sparse coding framework to learn interpretable spatio-temporal primitives. We formulated the problem as a tensor factorization problem with tensor group norm constraints over the primitives, diagonal constraints on the activations that provide interpretabi...
Programmatically Interpretable Reinforcement Learning
We study the problem of generating interpretable and verifiable policies through reinforcement learning. Unlike the popular Deep Reinforcement Learning (DRL) paradigm, in which the policy is represented by a neural network, the aim in Programmatically Interpretable Reinforcement Learning (PIRL) is to find a policy that can be represented in a high-level programming language. Such programmatic p...
Temporal logic motion planning for dynamic robots
In this paper, we address the temporal logic motion planning problem for mobile robots that are modeled by second order dynamics. Temporal logic specifications can capture the usual control specifications such as reachability and invariance as well as more complex specifications like sequencing and obstacle avoidance. Our approach consists of three basic steps. First, we design a control law th...
Robust Reinforcement Learning in Motion Planning
While exploring to find better solutions, an agent performing online reinforcement learning (RL) can perform worse than is acceptable. In some cases, exploration might have unsafe, or even catastrophic, results, often modeled in terms of reaching 'failure' states of the agent's environment. This paper presents a method that uses domain knowledge to reduce the number of failures during explorati...
Journal
Journal title: IEEE Robotics and Automation Letters
Year: 2023
ISSN: 2377-3766
DOI: https://doi.org/10.1109/lra.2023.3290511